We consider the problem of testing the parametric form of the volatility for high frequency data. 
It is demonstrated that in the presence of microstructure noise commonly used tests do not keep 
the preassigned level and are inconsistent. The concept of preaveraging is used to construct new 
tests, which do not suffer from these drawbacks. These tests are based on a Kolmogorov-Smirnov 
or Cramer-von-Mises functional of an integrated stochastic process, for which weak convergence 
to a (conditional) Gaussian process is established. The finite sample properties of a bootstrap 
version of the test are illustrated by means of a simulation study. 
1. Introduction 

The volatility is a popular measure of risk in finance with numerous applications in- 
cluding the construction of optimal portfolios, hedging and pricing of options. Therefore, 
estimating and investigating the volatility and its dynamics is of particular importance 
in applications and numerous models have been proposed for this purpose (see, e.g., 
Black and Scholes [6], Vasicek [25], Cox et al. [9], Hull and White [17] and Heston [16] 
among many others). Because a misspecification of the form of the volatility can have 
serious consequences for the subsequent data analysis, numerous authors recommend 
using goodness-of-fit tests for the postulated model (see, e.g., Ait-Sahalia [3], Corradi 
and White [8], Dette et al. [11], Dette and Podolskij [10] among others). 

In the present paper, we consider statistical inference in the case of high frequency data, 
where for an increasing sample size information about the whole path of the volatility 
is in principle available. However, in concrete applications the situation is more com- 
plicated because of the presence of microstructure noise, which is usually persistent in 
such data. This additional noise is caused by many sources of the trading process such 
as discreteness of observations (see, e.g., Harris [14], [15]), bid-ask bounces or special 
properties of the trading mechanism (see, e.g., Black [5] or Amihud and Mendelson [4]). 

Table 1. Simulated level of the test (1.1) for various choices of $\omega$ and $\theta$, where the true volatility 
function is $\sigma^2(t,x) = \theta + (1-\theta)x^2$ and the noise terms $U$ are normally distributed with mean 
zero and variance $\omega^2$. In all cases, the sample size is given by $n = 16384$ 

While micro-structure noise has been taken into account for the construction of estima- 
tors of the integrated volatility and other related quantities (see, e.g., Zhang et al. [26], 
Jacod et al. [19] or Podolskij and Vetter [22], [21]), properties of goodness-of-fit tests in 
this context have not been investigated so far in the literature. 

Consider for example the problem, where the process $\{Z_t \mid t \in [0,1]\}$ is observed at the 
$n$ time points $0, 1/n, \ldots, 1$. Under the assumption that $Z_t = X_t = \int_0^t \sigma_s\,dW_s$, Dette and 
Podolskij [10] propose to reject the hypothesis of a constant diffusion coefficient, that is, 
$H_0: \sigma_t^2 = \sigma^2(t, X_t) = \sigma^2$, whenever 

$$T_n(Z_1,\ldots,Z_n) = \sqrt{n}\,\sup_{t\in[0,1]} \left| \frac{\sum_{k=1}^{\lfloor nt \rfloor} |Z_{k/n} - Z_{(k-1)/n}|^2 - t\sum_{k=1}^{n} |Z_{k/n} - Z_{(k-1)/n}|^2}{\sqrt{2}\,\sum_{k=1}^{n} |Z_{k/n} - Z_{(k-1)/n}|^2} \right| > c_{1-\alpha}, \tag{1.1}$$
where $c_{1-\alpha}$ denotes the $(1-\alpha)$-quantile of the supremum of a Brownian bridge. Now 
consider the situation, where microstructure noise is present, which is usually modeled 
by an additional additive component, that is 

$$Z_{i/n} = X_{i/n} + U_{i/n}, \qquad i = 1,\ldots,n, \tag{1.2}$$

where $\{U_{i/n} \mid i = 1,\ldots,n\}$ denotes a triangular array of independent random variables 
with mean 0 and variance $\omega^2$. In Table 1, we show the finite sample behaviour of the test 
(1.1) for the hypothesis of a constant volatility if $\sigma_t^2 = \sigma^2(t,x) = \theta + (1-\theta)x^2$ (note that 
the case $\theta = 1$ corresponds to the null hypothesis). We observe that the test keeps its pre- 
assigned level only in the case where $\omega$ is rather small. In most cases, the nominal level is 
clearly underestimated. On the other hand, the test is not able to detect any alternative. 
An intuitive explanation for this behaviour is that in the presence of microstructure noise 
the increments $Z_{i/n} - Z_{(i-1)/n} = U_{i/n} - U_{(i-1)/n} + O_p(n^{-1/2})$ are dominated by the noise 
variables. This leads to inconsistent estimates of the integrated volatility as pointed out 
in Zhang et al. [26]. More precisely, a straightforward calculation shows that under mi- 
crostructure noise the statistic $T_n(Z_1,\ldots,Z_n)$ shows the same asymptotic behavior as the 
statistic $T_n(U_1,\ldots,U_n)$, which converges weakly to $\sqrt{\lambda/2}\,\sup_{t\in[0,1]} |B_t|$, no matter if the 
null hypothesis is valid or not. Here $B_t$ denotes a Brownian bridge and $\lambda = E[(U_{k/n}/\omega)^4]$. 
This means that in the presence of microstructure noise the test (1.1) has asymptotic 
level $\alpha$ if and only if $\lambda = 2$. In all other cases, the test does not keep its preassigned level. 
Moreover, because the asymptotic properties under null hypothesis and alternative are 
the same, the test is not consistent. 
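
The breakdown described above can be explored numerically. The following is a minimal sketch, not the simulation code behind Table 1: it assumes a constant volatility, Gaussian noise (for which $\lambda = 3$), and it evaluates the supremum in (1.1) on the grid $t = j/n$ only. The function names and the critical value 1.358 (approximately the 0.95-quantile of the supremum of the absolute value of a Brownian bridge) are choices made for this illustration.

```python
import math
import random


def ks_statistic(z):
    """T_n from (1.1); the supremum is taken over the grid t = j/n only."""
    n = len(z) - 1
    sq = [(z[k] - z[k - 1]) ** 2 for k in range(1, n + 1)]
    total = sum(sq)
    partial, sup = 0.0, 0.0
    for j in range(1, n + 1):
        partial += sq[j - 1]
        sup = max(sup, abs(partial - (j / n) * total))
    return math.sqrt(n) * sup / (math.sqrt(2.0) * total)


def noisy_constant_vol_path(n, sigma, omega, rng):
    """Z_{i/n} = X_{i/n} + U_{i/n} with X = sigma * W and N(0, omega^2) noise."""
    x, z = 0.0, []
    for i in range(n + 1):
        if i > 0:
            x += sigma * rng.gauss(0.0, math.sqrt(1.0 / n))
        z.append(x + rng.gauss(0.0, omega))
    return z


def rejection_rate(n, sigma, omega, crit, runs, seed=0):
    """Share of simulated paths for which T_n exceeds the critical value."""
    rng = random.Random(seed)
    hits = sum(
        ks_statistic(noisy_constant_vol_path(n, sigma, omega, rng)) > crit
        for _ in range(runs)
    )
    return hits / runs
```

Calling, for instance, `rejection_rate(1024, 1.0, 0.01, 1.358, 200)` for several values of `omega` gives a rough impression of how the empirical level moves away from the nominal one as the noise grows.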

The present paper is devoted to the problem of constructing a consistent asymptotic 
level a test for a general parametric form of the volatility in the presence of microstructure 
noise. In Sections 2 and 3, we present the basic model and introduce a stochastic process 
which can be used to test parametric hypotheses about the form of the volatility in a 
noisy framework. Our main results are presented in Section 4, where we prove stable 
convergence of two such processes which form the basis of the proposed goodness-of-fit 
tests. Section 5 deals with the problem of testing nonlinear hypotheses for the volatility, 
whereas in Section 6 the finite sample properties of a bootstrap version of the new tests 
are investigated. All proofs of the results are presented in the Appendix. 

2. Testing parametric hypotheses for the volatility 

Suppose that the process $X = (X_t)_t$ admits the representation 

$$X_t = X_0 + \int_0^t a_s\,ds + \int_0^t \sigma_s\,dW_s, \tag{2.1}$$

where $W = (W_t)_t$ is a standard Brownian motion and the drift process $a$ and the volatility 
process $\sigma$ satisfy some weak regularity conditions, which will be specified later. Further- 
more, we assume that the process can be observed at discrete points on a fixed time 
interval, say $[0,1]$. 

Various assumptions on the structure of the volatility process have been proposed in 
the literature. Among such models, a large class involves the case where $\sigma$ is defined to 
be a local volatility process, thus merely a function of time and state (see, e.g., Black and 
Scholes [6], Vasicek [25], Cox et al. [9], Chan et al. [7], Ait-Sahalia [3] or Ahn and Gao [2] 
among many others). Because an appropriate modeling of the volatility is of particular 
importance for the construction of portfolios, hedging and pricing, many authors point 
out that the postulated model should be validated by an appropriate goodness-of-fit test 
(see, e.g., Ait-Sahalia [3] or Corradi and White [8]). In several cases, the hypothesis for 
the parametric form of the volatility is linear and one has to consider the following two 
situations: 

$$H_0: \sigma_t^2 = \sigma^2(t, X_t) = \sum_{i=1}^d \theta_i \sigma_i^2(t, X_t) \quad \forall t \text{ a.s.} \qquad \text{or}$$
$$\bar H_0: \sigma_t = \bar\sigma(t, X_t) = \sum_{i=1}^d \bar\theta_i \bar\sigma_i(t, X_t) \quad \forall t \text{ a.s.,} \tag{2.2}$$

where the functions $\sigma_1^2, \ldots, \sigma_d^2$ (or $\bar\sigma_1, \ldots, \bar\sigma_d$) are known and the parameters $\theta_1, \ldots, \theta_d$ 
(or $\bar\theta_1, \ldots, \bar\theta_d$) are unknown, but assumed to ensure $\sigma^2(t,X_t) > 0$ (or $\bar\sigma(t,X_t) > 0$) almost 
surely. Other models involve volatility functions, where the parameters enter nonlinearly 
(see Ait-Sahalia [3]) and the corresponding hypotheses will be considered later in Sec- 
tion 5, because the basic concepts are easier to explain in the linear context. 

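Let us focus on the problem involving $H_0$ for the moment, as the other testing problem 
can be treated in the same way. Dette and Podolskij [10] propose to construct a test 
statistic using an empirical version of the stochastic process 

$$N_t = \int_0^t \Big\{ \sigma_s^2 - \sum_{j=1}^d \theta_j^{\min} \sigma_j^2(s, X_s) \Big\}\,ds, \qquad \theta^{\min} = \operatorname{argmin}_{\theta \in \mathbb{R}^d} \int_0^1 \Big\{ \sigma_s^2 - \sum_{j=1}^d \theta_j \sigma_j^2(s, X_s) \Big\}^2 ds.$$

Thus, one uses the $L^2$ distance to determine the best approximation to the unknown 
volatility process $\sigma^2$ by a linear combination of the given functions $\sigma_1^2, \ldots, \sigma_d^2$. It can 
easily be seen that $H_0$ is equivalent to $N_t = 0$ $\forall t$ a.s., and a well-known result from 
Hilbert space theory (see Achieser [1]) implies 

$$\theta^{\min} = D^{-1}C, \quad \text{thus} \quad N_t = B_t^0 - B_t^T D^{-1} C, \tag{2.3}$$

where $B_t = (B_t^1, \ldots, B_t^d)^T$, 

$$B_t^0 = \int_0^t \sigma_s^2\,ds \quad \text{and} \quad B_t^i = \int_0^t \sigma_i^2(s, X_s)\,ds \quad \text{for } i = 1, \ldots, d,$$

and $D$ and $C$ denote a $d \times d$-matrix and a $d$-dimensional vector, respectively, with 

$$D_{ij} = \int_0^1 \sigma_i^2(s, X_s)\sigma_j^2(s, X_s)\,ds \quad \text{and} \quad C_i = \int_0^1 \sigma_s^2 \sigma_i^2(s, X_s)\,ds.$$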

In practice, one does not observe the entire path of the diffusion process $X = (X_t)_t$ and 
it is therefore necessary to define an empirical version based on appropriate estimators for 
the quantities in (2.3). Let us briefly discuss the solution to the problem in the case, where 
$X$ can be observed without further restrictions. Based on the decomposition above, Dette 
and Podolskij [10] propose to define an empirical version $\hat N_t = \hat B_t^0 - \hat B_t^T \hat D^{-1} \hat C$, where one 
uses a Riemann approximation of each integral, choosing $n|X_{k/n} - X_{(k-1)/n}|^2$ as a local 
estimate for $\sigma^2_{(k-1)/n}$. Thus, 

$$\hat D_{ij} = \frac{1}{n}\sum_{k=1}^n \sigma_i^2\Big(\frac{k-1}{n}, X_{(k-1)/n}\Big)\sigma_j^2\Big(\frac{k-1}{n}, X_{(k-1)/n}\Big) \quad \text{for } i,j = 1,\ldots,d, \tag{2.4}$$
$$\hat C_i = \sum_{k=1}^n \sigma_i^2\Big(\frac{k-1}{n}, X_{(k-1)/n}\Big)|X_{k/n} - X_{(k-1)/n}|^2 \quad \text{for } i = 1,\ldots,d,$$

and the quantities $\hat B_t^0$ and $\hat B_t = (\hat B_t^1, \ldots, \hat B_t^d)^T$ are given by 

$$\hat B_t^0 = \sum_{k=1}^{\lfloor nt \rfloor} |X_{k/n} - X_{(k-1)/n}|^2, \qquad \hat B_t^i = \frac{1}{n}\sum_{k=1}^{\lfloor nt \rfloor} \sigma_i^2\Big(\frac{k-1}{n}, X_{(k-1)/n}\Big) \quad \text{for } i = 1,\ldots,d. \tag{2.5}$$
In this context, one can prove a (stable) central limit theorem for the process $(\hat N_t - N_t)_t$ 
with the optimal rate of convergence $n^{-1/2}$, from which one may assess the distribution of 
suitable test statistics. For example, if $d = 1$, $\sigma_1^2(t, X_t) = 1$, the hypothesis $H_0$ reduces to 
the hypothesis of constant volatility considered in the introduction, and the Kolmogorov- 
Smirnov statistic (1.1) converges to the supremum of a Brownian bridge. 
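
The noise-free construction can be sketched in a few lines. The code below is an illustration only: it assumes observations `x[0..n]` on the grid $k/n$, basis functions passed as callables for $\sigma_i^2(t,x)$, evaluation at the left endpoints $((k-1)/n, X_{(k-1)/n})$, and a small hand-written linear solver. All names are hypothetical and not taken from the paper.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small d)."""
    d = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, d):
            f = m[r][c] / m[c][c]
            for k in range(c, d + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * d
    for r in range(d - 1, -1, -1):
        x[r] = (m[r][d] - sum(m[r][k] * x[k] for k in range(r + 1, d))) / m[r][r]
    return x


def n_hat_path(x, basis):
    """Return [N_hat_{j/n} for j = 0..n] in the noise-free setting.

    x:     observations X_0, X_{1/n}, ..., X_1
    basis: list of callables sigma_i^2(t, x)
    """
    n, d = len(x) - 1, len(basis)
    # Basis functions at the left endpoints, one row per k = 1..n.
    h = [[s((k - 1) / n, x[k - 1]) for s in basis] for k in range(1, n + 1)]
    sq = [(x[k] - x[k - 1]) ** 2 for k in range(1, n + 1)]
    d_hat = [[sum(h[k][i] * h[k][j] for k in range(n)) / n for j in range(d)]
             for i in range(d)]
    c_hat = [sum(h[k][i] * sq[k] for k in range(n)) for i in range(d)]
    theta = solve(d_hat, c_hat)          # D_hat^{-1} C_hat
    path, b0, bi = [0.0], 0.0, [0.0] * d
    for k in range(n):
        b0 += sq[k]                      # B_hat^0 at t = (k+1)/n
        for i in range(d):
            bi[i] += h[k][i] / n         # B_hat^i at t = (k+1)/n
        path.append(b0 - sum(bi[i] * theta[i] for i in range(d)))
    return path
```

Whenever the constant function is among the basis functions, the normal equations force $\hat N_1 = 0$, which gives a quick sanity check for an implementation.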



3. Assumptions and definitions 

Since we are dealing with microstructure noise, we have to define a process $Z = (Z_t)_t$ 
which represents the noisy observations. Typically one relates $Z$ to the underlying Ito 
semimartingale $X$ through the equation $Z_t = X_t + U_t$ for some noise process $U$. We 
restrict ourselves to the case of i.i.d. noise, in which the process $U = (U_t)_t$ is independent 
of $X$ and satisfies 

$$E[U_t] = 0, \qquad E[U_t^2] = \omega^2, \qquad E[U_t^4] < \infty \tag{3.1}$$

with a density having compact support. A precise definition of a proper probability space 
that accommodates $Z$ can be found in Jacod et al. [19]. We assume further that $Z$ is 
observed at times $0, 1/n, \ldots, 1$. As pointed out in the introduction, the corresponding test 
based on $\hat N_t$ is not consistent for the hypothesis $H_0$ in the presence of such microstructure 
noise. Thus, our aim is to define appropriate estimators for the unknown quantities in 
(2.3) in this noisy framework, from which a more adequate statistic $\hat N_t$ can be constructed. 
Note that in contrast to the previous setting we do not only need a local estimator for 
the unknown volatility function $\sigma^2$, but also for the (unobservable) path of $X$ itself. 

The natural approach in order to construct estimators for the volatility is to use in- 
crements of $Z$ as in the no-noise case, even though a single increment does not provide 
sufficient information about $\sigma^2$. This problem can be overcome by applying the idea of 
pre-averaging, which was invented in Podolskij and Vetter [22] and is based on moving 
averages of $Z$. To this end, we choose first a sequence $m_n$, such that 

$$\frac{m_n}{\sqrt{n}} = \kappa + o(n^{-1/4}) \tag{3.2}$$

for some $\kappa > 0$, and a nonzero real-valued function $g: \mathbb{R} \to \mathbb{R}$, which vanishes outside of the 
interval $(0, 1)$, is continuous and piecewise $C^1$ and has a piecewise Lipschitz derivative $g'$. 
We associate with $g$ (and $n$) the following real valued numbers and functions: 

$$g_i^n = g\Big(\frac{i}{m_n}\Big), \quad h_i^n = g_i^n - g_{i+1}^n, \quad \psi_1 = \int_0^1 (g'(s))^2\,ds, \quad \psi_2 = \int_0^1 (g(s))^2\,ds,$$
$$s \in [0,1] \mapsto \phi_1(s) = \int_s^1 g'(u)g'(u-s)\,du, \quad \phi_2(s) = \int_s^1 g(u)g(u-s)\,du, \tag{3.3}$$
$$i,j = 1,2: \quad \Phi_{ij} = \int_0^1 \phi_i(s)\phi_j(s)\,ds.$$
Finally, we define for an arbitrary process $V$ the preaveraged statistic 

$$\bar V_k^n = \sum_{j=1}^{m_n} g_j^n \Delta_{k+j}^n V, \tag{3.4}$$

where $\Delta_j^n V = V_{j/n} - V_{(j-1)/n}$. Due to the assumptions on $g$ the pre-averaged statistic 
$\bar Z_k^n$ reduces the impact of the noise, but still provides information about the increments 
of $X$ (and thus locally about $\sigma$). Precisely, we have 

$$\bar X_k^n = O_p\Big(\sqrt{\frac{m_n}{n}}\Big) \quad \text{and} \quad \bar U_k^n = O_p\Big(\sqrt{\frac{1}{m_n}}\Big), \tag{3.5}$$

and by definition of $m_n$ both terms are of the same order. This means in particular that 
statistics based on $\bar Z_k^n$ are in general biased when used for volatility estimation, but it 
turns out that a larger choice of $m_n$ results in a worse rate of convergence. See Podolskij 
and Vetter [22] for details. 
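
The source of the bias can be made explicit by a heuristic second moment calculation. The display below is a sketch under simplifying assumptions that are not stated in this form in the text: the volatility is treated as locally constant, the drift is ignored, and Riemann sums are replaced by their limits $\psi_1$ and $\psi_2$.

```latex
% Heuristic: sigma locally constant, drift neglected, g(0) = g(1) = 0.
\begin{align*}
\operatorname{Var}(\bar X_k^n) &\approx \sigma_{k/n}^2 \, \frac{1}{n} \sum_{j=1}^{m_n} (g_j^n)^2
   \approx \frac{m_n}{n}\, \psi_2\, \sigma_{k/n}^2, \\
\operatorname{Var}(\bar U_k^n) &= \omega^2 \sum_{j} (g_{j+1}^n - g_j^n)^2
   \approx \frac{1}{m_n}\, \psi_1\, \omega^2
   \quad \text{(summation by parts)}, \\
E\big[|\bar Z_k^n|^2\big] &\approx n^{-1/2} \Big( \kappa \psi_2 \sigma_{k/n}^2 + \frac{\psi_1}{\kappa}\, \omega^2 \Big)
   \quad \text{for } m_n \approx \kappa \sqrt{n}.
\end{align*}
```

Both contributions are of order $n^{-1/2}$, so the noise term does not vanish relative to the volatility term and has to be subtracted explicitly.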

An estimator for $X_{k/n}$ can be constructed in a similar way: We set 

$$\hat X_{k/n} = \frac{1}{m_n} \sum_{j=1}^{m_n} Z_{(k+j)/n}, \tag{3.6}$$

and it is easy to see that this procedure reduces the impact of the noise variables around 
time $k/n$, but still provides information about the latent price $X_{k/n}$, since the path of $X$ 
is Hölder continuous of any order $\alpha < 1/2$. Also one observes essentially from (3.5) that 
the auxiliary sequence $m_n$ is chosen in the optimal way, giving the smallest possible size 
for the approximation error. 
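
As an illustration of the two local smoothing steps, the following sketch computes the pre-averaged increment and the moving-average estimate of the latent price. It assumes the weight function $g(x) = \min(x, 1-x)$, which satisfies the conditions above but is a choice made here, and a forward window $j = 1, \ldots, m_n$; the function names are hypothetical.

```python
def g(x):
    """Assumed weight function g(x) = min(x, 1 - x) on [0, 1], zero elsewhere."""
    return min(x, 1.0 - x) if 0.0 <= x <= 1.0 else 0.0


def pre_average(z, m, k):
    """Pre-averaged increment: sum_{j=1}^{m} g(j/m) * (Z_{(k+j)/n} - Z_{(k+j-1)/n}).

    z holds Z_0, Z_{1/n}, ..., Z_1; requires k + m <= n.
    """
    return sum(g(j / m) * (z[k + j] - z[k + j - 1]) for j in range(1, m + 1))


def x_hat(z, m, k):
    """Moving-average estimate of X_{k/n}: (1/m) * sum_{j=1}^{m} Z_{(k+j)/n}."""
    return sum(z[k + j] for j in range(1, m + 1)) / m
```

For a purely alternating noise sequence the weighted increments cancel, while a linear trend passes through, which is the qualitative effect described above.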

As pointed out before, we need additional assumptions on the process $X$ as well as on 
the given basis functions in $H_0$ and $\bar H_0$, respectively. Since the conditions on $\sigma_i^2$ and $\bar\sigma_i$ 
are similar, we will restrict ourselves to the first case only. 

It is required that the functions $\sigma_1^2, \ldots, \sigma_d^2$ are linearly independent and that each $\sigma_i^2$ is 
twice continuously differentiable. Moreover, we assume that $E[|\det(D)|^{-\beta}] < \infty$ for some 
$\beta > 0$. 

Regarding the various processes in $X$, the assumptions are as weak as possible when 
testing for $H_0$. We simply have to ensure that the process in (2.1) is well defined, which 
follows if we assume that $a$ is locally bounded and predictable and that $\sigma$ is càdlàg (see 
Jacod and Shiryaev [20] or Revuz and Yor [23]). When working with $\bar H_0$ we assume 
additionally that the true volatility process $\sigma$ is almost surely positive and that it has a 
representation of the form (2.1) as well, namely that it satisfies 

$$\sigma_t = \sigma_0 + \int_0^t a'_s\,ds + \int_0^t \sigma'_s\,dW_s + \int_0^t v'_s\,dV_s,$$

where $a'$, $\sigma'$ and $v'$ are adapted càdlàg processes, with $a'$ also being predictable and 
locally bounded, and $V$ is a second Brownian motion, independent of $W$. Moreover, $a$ is 
supposed to be càdlàg. 
4. Goodness-of-fit tests addressing microstructure noise 

We start with the construction of a test for the hypothesis $H_0$ again. Local estimators for 
the volatility can now be obtained from $|\bar Z_k^n|^2$, but we have seen before that this quantity 
is not an unbiased estimate for $\sigma^2_{k/n}$ and that it has a different stochastic order than 
the increments $X_{k/n} - X_{(k-1)/n}$ in the no-noise case. A corrected statistic (see Jacod et 
al. [19]) is given by 

$$\hat\sigma^2_{k/n} = \frac{n^{1/2}}{\kappa\psi_2}\Big( |\bar Z_k^n|^2 - n^{-1/2}\,\frac{\psi_1}{\kappa}\,\hat\omega^2 \Big) \quad \text{with} \quad \hat\omega^2 = \frac{1}{2n}\sum_{i=1}^n |\Delta_i^n Z|^2, \tag{4.1}$$

where the latter term is a consistent estimator for $\omega^2$, see Zhang et al. [26]. Mimicking 
the procedure from the no-noise case presented in Section 2, we set 

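$$\hat D_{ij} = \frac{1}{n}\sum_{k=1}^{n-m_n} \sigma_i^2\Big(\frac{k}{n}, \hat X_{k/n}\Big)\sigma_j^2\Big(\frac{k}{n}, \hat X_{k/n}\Big) \quad \text{and} \quad \hat C_i = \frac{1}{n}\sum_{k=1}^{n-m_n} \sigma_i^2\Big(\frac{k}{n}, \hat X_{k/n}\Big)\hat\sigma^2_{k/n} \tag{4.2}$$

as well as 

$$\hat B_t^0 = \frac{1}{n}\sum_{k=1}^{\lfloor nt \rfloor - m_n} \hat\sigma^2_{k/n} \quad \text{and} \quad \hat B_t^i = \frac{1}{n}\sum_{k=1}^{\lfloor nt \rfloor - m_n} \sigma_i^2\Big(\frac{k}{n}, \hat X_{k/n}\Big) \tag{4.3}$$

for $i, j = 1, \ldots, d$. We define at last the process 

$$\hat N_t = \hat B_t^0 - \hat B_t^T \hat D^{-1} \hat C, \tag{4.4}$$

which turns out to be an appropriate estimate of the process $\{N_t\}_{t\in[0,1]}$. Our 
first result specifies the asymptotic properties of the process $\{A_n(t)\}_{t\in[0,1]}$ with 
$A_n(t) = n^{1/4}(\hat N_t - N_t)$. 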

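Theorem 1. If the assumptions stated in the previous sections are satisfied, the process 
$(A_n(t))_{t\in[0,1]}$ converges weakly in $D[0,1]$ to a mean zero process $(A(t))_{t\in[0,1]}$. Condi- 
tionally on $\mathcal{F}$ the limiting process is Gaussian, and its finite dimensional distributions 
coincide with the conditional (with respect to $\mathcal{F}$) finite dimensional distributions of the 
process 

$$\Big\{ \gamma_V \big( 1_{\{V \le t\}} - B_t^T D^{-1} h(V, X_V) \big) - \Big( \int_0^t \gamma_s\,ds - B_t^T D^{-1} \int_0^1 \gamma_s h(s, X_s)\,ds \Big) \Big\}_{t\in[0,1]}, \tag{4.5}$$

where $V \sim U[0,1]$, $h(s, X_s) = (\sigma_1^2(s, X_s), \ldots, \sigma_d^2(s, X_s))^T$ and 

$$\gamma_s^2 = \frac{4}{\psi_2^2}\Big( \Phi_{22}\,\kappa\,\sigma_s^4 + 2\Phi_{12}\,\frac{\sigma_s^2\omega^2}{\kappa} + \Phi_{11}\,\frac{\omega^4}{\kappa^3} \Big). \tag{4.6}$$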
We see from Theorem 3 in the Appendix that the asymptotics is only driven by $\hat B_t^0$ 
and $\hat C$. The error due to the estimation of $B_t$ and $D$ is of small order, which explains the 
particular form of the limiting distribution. Note also that the rate of convergence $n^{-1/4}$ 
is optimal for this problem, since it is already optimal for the estimation of $B_t^0$ even in 
a parametric setting (cf. Gloter and Jacod [13]). 
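
To make the noisy construction concrete, the sketch below evaluates the bias-corrected local estimates and the resulting process on the grid $t = j/n$ in the simplest case $d = 1$, $\sigma_1^2 \equiv 1$ (constant volatility). It is an illustration under assumptions made here: the weight function is $g(x) = \min(x, 1-x)$, for which $\psi_1 = 1$ and $\psi_2 = 1/12$, the window length is $m_n = \mathrm{round}(\kappa\sqrt{n})$, and the limits $\psi_1, \psi_2$ are used instead of finite sample versions.

```python
import math

PSI1, PSI2 = 1.0, 1.0 / 12.0  # psi_1 and psi_2 for the assumed g(x) = min(x, 1 - x)


def g(x):
    """Assumed weight function, zero outside [0, 1]."""
    return min(x, 1.0 - x) if 0.0 <= x <= 1.0 else 0.0


def noise_robust_n_hat(z, kappa=0.5):
    """Return (path, omega2): N_hat on the grid t = j/n for d = 1, sigma_1^2 = 1,
    and the noise variance estimate omega_hat^2."""
    n = len(z) - 1
    m = max(2, round(kappa * math.sqrt(n)))          # m_n ~ kappa * sqrt(n)
    dz = [z[j] - z[j - 1] for j in range(1, n + 1)]  # dz[j-1] = Delta_j^n Z
    omega2 = sum(d * d for d in dz) / (2 * n)        # omega_hat^2
    weights = [g(j / m) for j in range(1, m + 1)]
    sigma2 = []                                      # sigma_hat^2_{k/n}, k = 1..n-m
    for k in range(1, n - m + 1):
        zbar = sum(w * dz[k + j - 1] for j, w in enumerate(weights, start=1))
        sigma2.append(math.sqrt(n) / (kappa * PSI2)
                      * (zbar ** 2 - PSI1 * omega2 / (kappa * math.sqrt(n))))
    c_hat = sum(sigma2) / n                          # C_hat for sigma_1^2 = 1
    d_hat = (n - m) / n                              # D_hat for sigma_1^2 = 1
    path, b0 = [], 0.0
    for j in range(n + 1):
        upper = j - m                                # floor(n t) - m_n at t = j/n
        if upper >= 1:
            b0 += sigma2[upper - 1] / n              # B_hat^0_t
        b1 = max(upper, 0) / n                       # B_hat^1_t
        path.append(b0 - b1 * c_hat / d_hat)
    return path, omega2
```

By construction the returned path ends at zero, and the interesting quantity is its fluctuation in between, rescaled by $n^{1/4}$.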

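In order to construct a test statistic based on Theorem 1, we have to define an appro- 
priate estimator for the conditional variance of the process $\{A(t)\}_{t\in[0,1]}$, which is given 
by 

$$s_t^2 = \int_0^t \gamma_s^2\,ds - 2B_t^T D^{-1}\int_0^t \gamma_s^2 h(s, X_s)\,ds + B_t^T D^{-1}\int_0^1 \gamma_s^2 h(s, X_s)h^T(s, X_s)\,ds\, D^{-1}B_t.$$

Obviously, we use $\hat B_t$ and $\hat D$ as the empirical counterparts for $B_t$ and $D$. In order to 
obtain estimates for the other random elements of $s_t^2$, note that $\gamma_s^2$ plays a key role in 
Jacod et al. [19] as well, where it is the (local) conditional variance in a central limit 
theorem for $n^{1/4}(\hat B_t^0 - B_t^0)$. Thus, in accordance with that paper we define 

$$\hat\Gamma_k = \frac{4\Phi_{22}}{3\kappa\psi_2^4}|\bar Z_k^n|^4 + n^{-1/2}\,\frac{8}{\kappa^2}\Big( \frac{\Phi_{12}}{\psi_2^3} - \frac{\Phi_{22}\psi_1}{\psi_2^4} \Big)|\bar Z_k^n|^2\hat\omega^2 + n^{-1}\,\frac{4}{\kappa^3}\Big( \frac{\Phi_{11}}{\psi_2^2} - \frac{2\Phi_{12}\psi_1}{\psi_2^3} + \frac{\Phi_{22}\psi_1^2}{\psi_2^4} \Big)\hat\omega^4,$$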

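which is a local estimator for the process $\gamma^2$ after rescaling. Thus, we set 

$$\hat g_0(t) = \sum_{k=1}^{\lfloor nt \rfloor - m_n} \hat\Gamma_k \xrightarrow{P} \int_0^t \gamma_s^2\,ds,$$
$$\hat g_i(t) = \sum_{k=1}^{\lfloor nt \rfloor - m_n} \hat\Gamma_k\,\sigma_i^2\Big(\frac{k-1}{n}, \hat X_{(k-1)/n}\Big) \xrightarrow{P} \int_0^t \gamma_s^2\sigma_i^2(s, X_s)\,ds,$$
$$\hat g_{ij} = \sum_{k=1}^{n - m_n} \hat\Gamma_k\,\sigma_i^2\Big(\frac{k-1}{n}, \hat X_{(k-1)/n}\Big)\sigma_j^2\Big(\frac{k-1}{n}, \hat X_{(k-1)/n}\Big) \xrightarrow{P} \int_0^1 \gamma_s^2\sigma_i^2(s, X_s)\sigma_j^2(s, X_s)\,ds.$$

Inserting these estimators into the corresponding elements of $s_t^2$ gives the consistent 
estimator 

$$\hat s_t^2 = \hat g_0(t) - 2\hat B_t^T \hat D^{-1}\hat g(t) + \hat B_t^T \hat D^{-1}\hat G\hat D^{-1}\hat B_t, \tag{4.7}$$

where $\hat g(t) = (\hat g_1(t), \ldots, \hat g_d(t))^T$ and $\hat G = (\hat g_{ij})_{i,j=1}^d$. A consistent test for the hypothesis $H_0$ 
is now obtained by rejecting the null hypothesis for large values of a Kolmogorov-Smirnov 
or Cramér-von Mises functional of the process $\{n^{1/4}\hat N_t/\hat s_t\}_{t\in[0,1]}$. Note however that the 
distribution of this process is not feasible in general: even though for each fixed $t$ the 
statistic $n^{1/4}\hat N_t/\hat s_t$ converges weakly to a standard normal distribution, the covariance 
structure of the process typically depends on the entire (unobservable) process $(X_t)_t$. For 
this reason, we will later use a bootstrap procedure to obtain critical values. 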
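
Once the standardised process is available on a grid, the two functionals are straightforward to evaluate. The following sketch assumes that the values of $\hat N_t$ and $\hat s_t$ have already been computed at common grid points; skipping grid points with a non-positive variance estimate and approximating the Cramér-von Mises integral by an average are simplifications of this illustration.

```python
def ks_and_cvm(n_hat, s_hat, n):
    """Kolmogorov-Smirnov and Cramer-von Mises functionals of n^{1/4} N_hat_t / s_hat_t.

    n_hat, s_hat: values on a common grid in [0, 1]; n: sample size.
    Grid points with s_hat <= 0 are skipped (assumption of this sketch).
    """
    scale = n ** 0.25
    ratios = [scale * a / s for a, s in zip(n_hat, s_hat) if s > 0.0]
    ks = max(abs(r) for r in ratios)                  # sup_t |.|
    cvm = sum(r * r for r in ratios) / len(ratios)    # Riemann approximation of the integral
    return ks, cvm
```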
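In principle, a similar approach can be used to construct a test for the hypothesis $\bar H_0$. 
However, in this case things change considerably. Dette and Podolskij [10] restate this 
hypothesis as $M_t = 0$ $\forall t$ a.s., where 

$$M_t = \int_0^t \Big\{ \sigma_s - \sum_{j=1}^d \bar\theta_j^{\min}\bar\sigma_j(s, X_s) \Big\}\,ds, \qquad \bar\theta^{\min} = \operatorname{argmin}_{\bar\theta \in \mathbb{R}^d} \int_0^1 \Big\{ \sigma_s - \sum_{j=1}^d \bar\theta_j\bar\sigma_j(s, X_s) \Big\}^2 ds. \tag{4.8}$$

Obviously, we have an analogous representation as in (2.3), namely $M_t = R_t^0 - R_t^T Q^{-1} S$, 
where 

$$R_t^0 = \int_0^t \sigma_s\,ds \quad \text{and} \quad R_t^i = \int_0^t \bar\sigma_i(s, X_s)\,ds \quad \text{for } i = 1, \ldots, d,$$

and $Q$ and $S$ are a $d \times d$-matrix and a $d$-dimensional vector, respectively, with 

$$Q_{ij} = \int_0^1 \bar\sigma_i(s, X_s)\bar\sigma_j(s, X_s)\,ds \quad \text{and} \quad S_i = \int_0^1 \sigma_s\bar\sigma_i(s, X_s)\,ds.$$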

However, an appropriate definition of an empirical version of the form $\hat M_t = \hat R_t^0 - \hat R_t^T \hat Q^{-1}\hat S$ 
requires some less obvious modifications, because local estimators for $\sigma_s$ are 
more difficult to obtain in this setting. Using a preaveraged estimator of the form $|\bar Z_k^n|$ 
again causes an intrinsic bias, but due to the absolute value (instead of the square as 
in the previous setting) its correction turns out to be impossible at the optimal rate. 
However, we can see from (3.5) that using in (3.2) a sequence of a larger magnitude than 
$n^{1/2}$ reduces the impact of the noise terms in $\bar Z_k^n$. This modification makes inference 
about $\sigma_s$ possible, though resulting in a worse rate of convergence. To be precise, we fix 
some $\delta > 1/6$ and choose $l_n$ such that 

$$\frac{l_n}{n^{1/2+\delta}} = \rho + o(n^{-(1/4+\delta/2)})$$

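for some $\rho > 0$. Using the sequence $l_n$ instead of $m_n$, we define all quantities from (3.3) 
to (3.6) in the straightforward way. Next, we set 

$$\hat\sigma_{k/n} = \frac{n^{1/4-\delta/2}}{\mu_1\sqrt{\rho\psi_2}}\,|\bar Z_k^n|$$

as a local estimator for $\sigma_{k/n}$, where $\mu_1$ denotes the first absolute moment of a standard 
normal distribution. In a similar way as before, 

$$\hat Q_{ij} = \frac{1}{n}\sum_{k=1}^{n-l_n} \bar\sigma_i\Big(\frac{k}{n}, \hat X_{k/n}\Big)\bar\sigma_j\Big(\frac{k}{n}, \hat X_{k/n}\Big) \quad \text{and} \quad \hat S_i = \frac{1}{n}\sum_{k=1}^{n-l_n} \bar\sigma_i\Big(\frac{k}{n}, \hat X_{k/n}\Big)\hat\sigma_{k/n}$$

as well as 

$$\hat R_t^0 = \frac{1}{n}\sum_{k=1}^{\lfloor nt \rfloor - l_n} \hat\sigma_{k/n} \quad \text{and} \quad \hat R_t^i = \frac{1}{n}\sum_{k=1}^{\lfloor nt \rfloor - l_n} \bar\sigma_i\Big(\frac{k}{n}, \hat X_{k/n}\Big)$$

for $i, j = 1, \ldots, d$. Finally, we define $B_n(t) = n^{1/4-\delta/2}(\hat M_t - M_t)$ for any $t \in [0,1]$ and 
obtain the following result. 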

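Theorem 2. If the assumptions stated in the previous sections are satisfied, the process 
$(B_n(t))_{t\in[0,1]}$ converges weakly in $D[0,1]$ to a mean zero process $(B(t))_{t\in[0,1]}$. Condi- 
tionally on $\mathcal{F}$ the limiting process is Gaussian, and its finite dimensional distributions 
coincide with the conditional (with respect to $\mathcal{F}$) finite dimensional distributions of the 
process 

$$\Big\{ \bar\gamma_V \big( 1_{\{V \le t\}} - R_t^T Q^{-1} h(V, X_V) \big) - \Big( \int_0^t \bar\gamma_s\,ds - R_t^T Q^{-1} \int_0^1 \bar\gamma_s h(s, X_s)\,ds \Big) \Big\}_{t\in[0,1]}, \tag{4.9}$$

where $V \sim U[0,1]$, $h(s, X_s) = (\bar\sigma_1(s, X_s), \ldots, \bar\sigma_d(s, X_s))^T$ and 

$$\bar\gamma_s^2 = \frac{2\rho\,\sigma_s^2}{\mu_1^2}\int_0^1 f\Big( \frac{\phi_2(u)}{\psi_2} \Big)\,du, \tag{4.10}$$

$$f(u) = \frac{2}{\pi}\big( u\arcsin(u) + \sqrt{1-u^2} - 1 \big).$$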
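The estimation of the conditional variance of the process $\{B(t)\}_{t\in[0,1]}$, 

$$r_t^2 = \int_0^t \bar\gamma_s^2\,ds - 2R_t^T Q^{-1}\int_0^t \bar\gamma_s^2 h(s, X_s)\,ds + R_t^T Q^{-1}\int_0^1 \bar\gamma_s^2 h(s, X_s)h^T(s, X_s)\,ds\, Q^{-1}R_t,$$

becomes easier in this context, as the order of $l_n$ is chosen in such a way that no charac- 
teristics of $U$ are involved anymore. Writing $\hat{\bar\Gamma}_k$ for the natural local estimator of $\bar\gamma^2_{k/n}$ 
(after rescaling), we set 

$$\hat h_0(t) = \sum_{k=1}^{\lfloor nt \rfloor - l_n} \hat{\bar\Gamma}_k \xrightarrow{P} \int_0^t \bar\gamma_s^2\,ds,$$
$$\hat h_i(t) = \sum_{k=1}^{\lfloor nt \rfloor - l_n} \hat{\bar\Gamma}_k\,\bar\sigma_i\Big(\frac{k-1}{n}, \hat X_{(k-1)/n}\Big) \xrightarrow{P} \int_0^t \bar\gamma_s^2\bar\sigma_i(s, X_s)\,ds,$$
$$\hat h_{ij} = \sum_{k=1}^{n - l_n} \hat{\bar\Gamma}_k\,\bar\sigma_i\Big(\frac{k-1}{n}, \hat X_{(k-1)/n}\Big)\bar\sigma_j\Big(\frac{k-1}{n}, \hat X_{(k-1)/n}\Big) \xrightarrow{P} \int_0^1 \bar\gamma_s^2\bar\sigma_i(s, X_s)\bar\sigma_j(s, X_s)\,ds,$$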
and consequently a consistent estimator $\hat r_t^2$ for the conditional variance is given by 

$$\hat r_t^2 = \hat h_0(t) - 2\hat R_t^T \hat Q^{-1}\hat h(t) + \hat R_t^T \hat Q^{-1}\hat H\hat Q^{-1}\hat R_t, \tag{4.11}$$

where $\hat h(t) = (\hat h_1(t), \ldots, \hat h_d(t))^T$ and $\hat H = (\hat h_{ij})_{i,j=1}^d$. A consistent test for the hypothesis 
$\bar H_0$ is now obtained by rejecting the null hypothesis for large values of the Kolmogorov- 
Smirnov or Cramér-von Mises functional of the process $\{n^{1/4-\delta/2}\hat M_t/\hat r_t\}_{t\in[0,1]}$. 

Note that one knows from previous work that it is neither necessary to define $X$ 
to be an Ito semimartingale with continuous paths as in (2.1) nor to model the noise 
terms $U$ as being independent and identically distributed to obtain similar results as in 
Theorems 1 and 2. In fact, for an underlying Ito semimartingale exhibiting jumps one can 
use bipower-type estimators as discussed in Podolskij and Vetter [21] in order to define 
an estimator closely related to $\hat B_t^0$. Moreover, it has been argued in Jacod et al. [19] that 
even for a noise process with a càdlàg variance a similar theory as presented in this paper 
applies. 



5. Nonlinear hypotheses 

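In this section, we briefly discuss the case of a nonlinear hypothesis 

$$H_0: \sigma_t^2 = \sigma^2(t, X_t) = \sigma^2(t, X_t, \theta) \quad \forall t \text{ a.s.,} \tag{5.1}$$

where $\theta \in \Theta \subset \mathbb{R}^d$ denotes the unknown parameter and $\sigma^2$ satisfies some differentiability 
assumption. As before, we restate $H_0$ as $N_t = 0$ $\forall t$ a.s., where $N_t$ is the difference between 
the true integrated volatility and its best $L^2$-approximation from the parametric class. 
Therefore, we set $N_t = B_t^0 - B_t(\theta_0)$ with $B_t^0$ from above and $B_t(\theta) = \int_0^t \sigma^2(s, X_s, \theta)\,ds$. 
We have $\theta_0 = \operatorname{argmin}_{\theta\in\Theta} f(\theta)$ with 

$$f(\theta) = \int_0^1 \{ \sigma_s^2 - \sigma^2(s, X_s, \theta) \}^2\,ds.$$

In order to obtain some $\hat N_t$, we use $\hat B_t^0$ from (4.3) and need estimates for $B_t(\theta)$ and $f(\theta)$. 
We set 

$$\hat B_t(\theta) = \frac{1}{n}\sum_{k=1}^{\lfloor nt \rfloor - m_n} \sigma^2\Big(\frac{k}{n}, \hat X_{k/n}, \theta\Big) \quad \text{and} \quad f_n(\theta) = \frac{1}{n}\sum_{k=1}^{n - m_n} \Big( \hat\sigma^2_{k/n} - \sigma^2\Big(\frac{k}{n}, \hat X_{k/n}, \theta\Big) \Big)^2, \tag{5.2}$$

and with $\hat\theta = \operatorname{argmin}_{\theta\in\Theta} f_n(\theta)$ we define $\hat N_t = \hat B_t^0 - \hat B_t(\hat\theta)$. 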

When deriving the asymptotic distribution of $n^{1/4}(\hat N_t - N_t)$, the difference compared 
to the previous section regards only $\hat B_t(\hat\theta) - B_t(\theta_0)$. In the following, we will give some 
hints that explain why that discrepancy is actually quite small. In fact, we will show that 

$$\hat B_t(\hat\theta) - B_t(\theta_0) = -\Big( \int_0^t \frac{\partial}{\partial\theta}\sigma^2(s, X_s, \theta_0)\,ds \Big)^T \big( f''(\theta_0) \big)^{-1} f_n'(\theta_0) + o_p(n^{-1/4}) \tag{5.3}$$

holds. Thus there is a one-to-one correspondence to the linear case, as the first two 
quantities are analogues of $B_t^T$ and $D^{-1}$, whereas $-f_n'(\theta_0)$ plays the role of $\hat C - C$. 
Consequently, the process $n^{1/4}(\hat N_t - N_t)$ exhibits a similar asymptotic behavior as in the 
linear case. 
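
The correspondence with the linear case can be checked directly. The computation below is a sketch that assumes (5.3) has the form $\hat B_t(\hat\theta) - B_t(\theta_0) = -\big(\int_0^t \partial_\theta \sigma^2(s,X_s,\theta_0)\,ds\big)^T (f''(\theta_0))^{-1} f_n'(\theta_0) + o_p(n^{-1/4})$ and specialises it to $\sigma^2(s,x,\theta) = \theta^T h(s,x)$ with $h = (\sigma_1^2, \ldots, \sigma_d^2)^T$, using $f$ and $f_n$ as least squares criteria without a factor $1/2$.

```latex
% Linear special case: sigma^2(s, x, theta) = theta^T h(s, x).
\begin{align*}
\int_0^t \frac{\partial}{\partial\theta}\sigma^2(s, X_s, \theta_0)\,ds &= \int_0^t h(s, X_s)\,ds = B_t, \\
f''(\theta_0) &= 2\int_0^1 h(s, X_s)\,h^T(s, X_s)\,ds = 2D, \\
f_n'(\theta_0) &= -\frac{2}{n}\sum_{k=1}^{n-m_n}
   \Big(\hat\sigma^2_{k/n} - \theta_0^T h\big(\tfrac{k}{n}, \hat X_{k/n}\big)\Big)\,
   h\big(\tfrac{k}{n}, \hat X_{k/n}\big) = -2\,(\hat C - \hat D\theta_0), \\
-B_t^T (2D)^{-1} f_n'(\theta_0) &= B_t^T D^{-1} (\hat C - \hat D\theta_0)
   = B_t^T D^{-1}\big\{(\hat C - C) - (\hat D - D)\theta_0\big\},
\end{align*}
% where C = D theta_0 was used in the last step.
```

The factors 2 cancel, and the term involving $\hat D - D$ is of smaller order, which is consistent with the statement that only $\hat C - C$ matters asymptotically.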

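In order to prove (5.3), note from similar arguments as in the proof of Theorem 3 that 

$$\hat B_t(\hat\theta) - B_t(\theta_0) = \int_0^t \{ \sigma^2(s, X_s, \hat\theta) - \sigma^2(s, X_s, \theta_0) \}\,ds + o_p(n^{-1/4}). \tag{5.4}$$

Under common regularity conditions for nonlinear regression (see Gallant [12] or Seber 
and Wild [24]), $\theta_0$ is the unique minimum of $f$ and attained at an interior point of $\Theta$. 
It is easy to see that $\hat\theta \to \theta_0$ in probability in this case, and thus we can assume that $\hat\theta$ 
satisfies $f_n'(\hat\theta) = 0$. This implies 

$$0 = f_n'(\hat\theta) = f_n'(\theta_0) + f_n''(\tilde\theta)(\hat\theta - \theta_0) \quad \Leftrightarrow \quad \hat\theta - \theta_0 = -(f_n''(\tilde\theta))^{-1} f_n'(\theta_0)$$

for an appropriate choice of $\tilde\theta$. We have $\tilde\theta \to \theta_0$ in probability as well, and therefore it 
can be assumed that the $d \times d$-dimensional matrix $f_n''(\tilde\theta)$ is positive definite and that the 
difference $\| f_n''(\tilde\theta) - f_n''(\theta_0) \|$ is small. Furthermore, $f_n''(\theta_0)$ takes the form 

$$f_n''(\theta_0) = \frac{2}{n} S^T S - \frac{2}{n}\sum_{k=1}^{n-m_n} \Big( \hat\sigma^2_{k/n} - \sigma^2\Big(\frac{k}{n}, \hat X_{k/n}, \theta_0\Big) \Big) H_k,$$

where the $(n - m_n) \times d$ matrix $S$ and the Hessian $H_k$ are given by 

$$S = \Big( \frac{\partial}{\partial\theta}\sigma^2\Big(\frac{k}{n}, \hat X_{k/n}, \theta_0\Big)^T \Big)_{k=1,\ldots,n-m_n} \quad \text{and} \quad H_k = \frac{\partial^2}{\partial\theta^2}\sigma^2\Big(\frac{k}{n}, \hat X_{k/n}, \theta_0\Big).$$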

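From the same arguments that lead to (5.4), we have $f_n''(\theta_0) = f''(\theta_0) + O_p(n^{-1/4})$, where 

$$f''(\theta_0) = 2\int_0^1 \frac{\partial}{\partial\theta}\sigma^2(s, X_s, \theta_0)\Big( \frac{\partial}{\partial\theta}\sigma^2(s, X_s, \theta_0) \Big)^T ds - 2\int_0^1 \{ \sigma_s^2 - \sigma^2(s, X_s, \theta_0) \}\frac{\partial^2}{\partial\theta^2}\sigma^2(s, X_s, \theta_0)\,ds$$

is positive definite. Note that the second term in this sum vanishes, when either the 
hypothesis is linear (since the Hessian is zero) or the null hypothesis is valid (since $\sigma_s^2$ 
equals $\sigma^2(s, X_s, \theta_0)$). In these cases the matrix $f''(\theta_0)$ takes, up to the factor 2, precisely 
the same form as $D$ in the linear setting. In any case, $f''(\theta_0)$ is of order $O_p(1)$. 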
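Regarding $f_n'(\theta_0)$, a similar calculation as given in the Appendix plus the definition of 
$\theta_0$ yield 

$$f_n'(\theta_0) = -2\Big\{ \frac{1}{n}\sum_{k=1}^{n-m_n} \hat\sigma^2_{k/n}\frac{\partial}{\partial\theta}\sigma^2\Big(\frac{k}{n}, \hat X_{k/n}, \theta_0\Big) - \int_0^1 \sigma_s^2\frac{\partial}{\partial\theta}\sigma^2(s, X_s, \theta_0)\,ds \Big\} + o_p(n^{-1/4}),$$

and thus $f_n'(\theta_0)$ is of order $O_p(n^{-1/4})$, just as $\hat C - C$. We conclude that $\hat\theta - \theta_0 = O_p(n^{-1/4})$ 
as well, and a Taylor expansion gives (5.3). 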



6. Simulation study 

We have indicated in the introduction that the original test for a constant volatility from 
the noise-free model loses its asymptotic properties in the presence of noise. Unsurpris- 
ingly, for a smaller variance of the noise variables, the data look more like observations 
from a continuous semimartingale and thus the test statistic behaves roughly in the 
same way as before, provided that the sample size is not too large. On the other hand, 
for a large variance of the error terms these are dominating, and thus the whole procedure 
breaks down even for small sample sizes. The same problem arises if the variance of the 
error is small but the sample size is large (see the discussion in the Introduction). We 
start with a further example simulating the level of the bootstrap test proposed by Dette 
and Podolskij [10] for a parametric hypothesis, assessing its quality for various sample 
sizes $n$ and different variances $\omega^2$. 

Precisely, we have used that test for testing the hypothesis $H_0$: $\sigma^2(t,x) = \theta x^2$, where 
the drift function is $b(t,x) = 0.1x$. The results are obtained from 1000 simulation runs and 500 bootstrap 
replications and displayed in Table 2 for various sample sizes and standard deviations 
$\omega$ of the noise process. We observe that for $n = 256$ and a (small) standard deviation 
of $\omega = 0.001$ the test does roughly keep its asymptotic level, whereas it cannot be used 
at all when the variance becomes larger. Moreover, even if the variance is small but the 


Table 2. Simulated level of the bootstrap test proposed by Dette and Podolskij [10], where 
the volatility function equals $H_0$: $\sigma^2(t,x) = \theta x^2$, but the observations are corrupted with 
normally distributed noise having variance $\omega^2$ 

                     n = 256                    n = 1024
omega / alpha    0.025    0.05     0.1      0.025    0.05     0.1
0.001            0.033    0.062    0.111    0.333    0.415    0.512
0.002            0.158    0.243    0.324    0.810    0.862    0.907
0.004            0.392    0.518    0.650    0.993    0.996    0.998
0.005            0.497    0.628    0.742    0.991    0.994    0.998
0.01             0.596    0.754    0.873    0.987    0.998    0.999
sample size is increased, the test does not keep its pre-assigned level (see the results 
for $\omega = 0.001$ and $n = 1024$ in Table 2). Thus, in practice the application of testing 
procedures addressing the problem of microstructure noise is strictly recommended. 

In the following, we illustrate the finite sample properties of a bootstrap version 
of the Kolmogorov-Smirnov test based on the processes investigated in Sections 4 and 5. 
Since the stochastic order of $|\Delta_i^n Z|$ is basically determined by the maximum of $n^{-1/2}$ 
and $\omega$ (which are the orders of $|\Delta_i^n X|$ and $|\Delta_i^n U|$, respectively), we kept $n\omega^2 = 0.1024$ 
fixed in order to have comparable results for different sample sizes $n$. The regularisation 
parameters $\kappa$ and $\rho$ were set to be 1/2 each. All simulation results presented in the 
following paragraphs are based on 1000 simulation runs and 500 bootstrap replications 
(if the bootstrap is applied to estimate critical values). 

For all testing problems discussed below, we have not used exactly the statistics $\hat N_t$ 
and $\hat M_t$, but related versions accounting for finite sample adjustments. Following Jacod 
et al. [19], where it has been shown that finite sample corrections improve the behaviour 
of the estimate $\hat B_t^0$ (and presumably of $\hat C$ as well) substantially, we have replaced the 
quantities $\psi_i$ and $\Phi_{ij}$ in (3.3) by certain numbers $\psi_i^n$ and $\Phi_{ij}^n$, which constitute the 
"true" quantities for finite samples, but are replaced by their limits $\psi_i$ and $\Phi_{ij}$ in the 
asymptotics. See Jacod et al. [19] for details. 

6.1. Testing for homoscedasticity 

In the problem of testing for homoscedasticity the limiting process $(A(t))_{t\in[0,1]}$ has an 
extremely simple form, when the null hypothesis of a constant volatility holds. In fact, 
the finite dimensional distributions of the process $(A(t))_{t\in[0,1]}$ coincide with those of 
a rescaled Brownian bridge, thus $(A_n(t)/\hat s_t)_{t\in[0,1]}$ converges weakly to $(B_t)_{t\in[0,1]}$. We 
have investigated the properties of the Kolmogorov-Smirnov test for different sample 
sizes $n$, where the noise satisfies $U \sim \mathcal{N}(0, \omega^2)$ and the drift function is again given by 
$b(t,x) = 0.1x$. A similar test can be constructed using Theorem 2, but the corresponding 
results are omitted for the sake of brevity as the rate of convergence in this case becomes 
worse. 

In Table 3, we present the simulated level of the Kolmogorov-Smirnov test using the 
critical values from the asymptotic distribution. It can be seen that the asymptotic level 
of the test is slightly underestimated. This effect becomes less visible for a larger sample 
size, but even then it is still apparent. Note that these findings are in line with previous 
simulations on noisy observations and it is likely that they are due to the fact that the rate 
of convergence for most testing problems is only $n^{-1/4}$. 

Table 3. Simulated level of the test, which rejects the null hypothesis of homoscedasticity 
for a large value of $\sup_{t\in[0,1]} |A_n(t)/\hat s_t|$, using the critical values from the asymptotic theory. 
The variance of the noise process is defined by $n\omega^2 = 0.1024$ 

n / alpha    0.025    0.05     0.1
256          0.008    0.022    0.058
1024         0.007    0.023    0.062
4096         0.013    0.029    0.079
16384        0.017    0.038    0.077

6.2. Testing general hypotheses 

For a general null hypothesis in (2.2), the distribution of the limiting process $(A(t))_{t\in[0,1]}$ 
depends on the path of the underlying semimartingale $(X_t)_{t\in[0,1]}$ and on the volatility 
$(\sigma_t)_{t\in[0,1]}$, and thus we cannot use it directly for the calculation of critical values. For 
this reason, we propose the application of the parametric bootstrap in order to obtain 
simulated critical values. First, we compute the global estimators $\hat\omega^2$ and $\hat\theta = \hat D^{-1}\hat C$ as 
well as each $n^{1/4}\hat N_t$ and $\hat s_t$ from the observed data. Under the null hypothesis $N_t$ equals 
zero, and thus it is intuitively clear that the null hypothesis has to be rejected for large 
values of the standardised Kolmogorov-Smirnov statistic $Y_n = \sup_{t\in[0,1]} |n^{1/4}\hat N_t/\hat s_t|$. 

In a second step we generate bootstrap data $Z^*_{j/n} = X^*_{j/n} + U^*_{j/n}$, where the $X^*_{j/n}$ 
are realisations of the process in (2.1) with $b_s = 0$ and $\sigma_s^2 = \sigma^2(s,X_s) = \sum_{k=1}^d \hat\theta_k\sigma_k^2(s,X_s)$ 
(corresponding to the null hypothesis) and each $U^*_{j/n}$ is normally distributed with mean 
zero and variance $\hat\omega^2$. Using these data, we calculate the corresponding bootstrap statistics 
$Y_n^*$ and use these to compute the quantiles of the bootstrap distribution. Finally, 
the null hypothesis is rejected if $Y_n$ is larger than the $(1-\alpha)$-quantile of the bootstrap distribution. 
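
The decision rule of the bootstrap test can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the names `ks_statistic`, `bootstrap_critical_value` and `bootstrap_test` are hypothetical, and `simulate_statistic` stands for a user-supplied routine that generates one bootstrap sample under the null hypothesis and returns the corresponding bootstrap statistic.

```python
import numpy as np

def ks_statistic(n_hat, s_hat, n):
    """Standardised Kolmogorov-Smirnov statistic sup_t |n^{1/4} N_t / s_t|,
    evaluated on the grid on which n_hat and s_hat are given."""
    n_hat = np.asarray(n_hat, dtype=float)
    s_hat = np.asarray(s_hat, dtype=float)
    return float(np.max(np.abs(n ** 0.25 * n_hat / s_hat)))

def bootstrap_critical_value(boot_stats, alpha):
    """Empirical (1 - alpha)-quantile of the bootstrap statistics."""
    return float(np.quantile(np.asarray(boot_stats, dtype=float), 1.0 - alpha))

def bootstrap_test(y_obs, simulate_statistic, n_boot=500, alpha=0.05, seed=0):
    """Reject the null hypothesis if y_obs exceeds the bootstrap quantile.

    simulate_statistic(rng) is a hypothetical callable returning one
    bootstrap replication of the test statistic."""
    rng = np.random.default_rng(seed)
    boot_stats = [simulate_statistic(rng) for _ in range(n_boot)]
    crit = bootstrap_critical_value(boot_stats, alpha)
    return y_obs > crit, crit
```

The number of bootstrap replications `n_boot` is a free choice in this sketch; the text does not specify it.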

In order to investigate the approximation of the nominal level we consider the hypothesis 
of constant volatility and the hypothesis $H_0: \sigma^2(t,x) = \theta x^2$. The data is generated 
under the null hypothesis with drift function $b(t,x) = 0.1x$ and the rejection probabilities 
are depicted in Table 4. These results show that the bootstrap approximation works well 
even for a small $n$. In particular, we see that in the case of homoscedasticity the exact 
asymptotic test using the weak convergence of $Y_n$ to the supremum of a standard Brownian 
bridge is outperformed (compare with Table 3). In the case of testing the parametric 
hypothesis $H_0: \sigma^2(t,x) = \theta x^2$ we observe a slight overestimation of the nominal level by 
the bootstrap test. 
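
Data of the kind used in this experiment could be generated along the following lines. The sketch uses a plain Euler scheme on $[0,1]$ with additive Gaussian noise; the discretisation, the starting value, the value of $\theta$ and the reading of the noise variance as $\omega^2 = 0.1024/n$ are assumptions made for illustration, and the function name is hypothetical.

```python
import numpy as np

def simulate_noisy_path(n, drift, vol2, omega2, x0=1.0, seed=0):
    """Euler scheme for dX = drift(t, X) dt + sqrt(vol2(t, X)) dW on [0, 1],
    observed at times i/n with additive N(0, omega2) noise.

    Returns the noisy observations Z and the latent path X."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        t = i * dt
        dw = rng.normal(0.0, np.sqrt(dt))          # Brownian increment
        x[i + 1] = x[i] + drift(t, x[i]) * dt + np.sqrt(vol2(t, x[i])) * dw
    u = rng.normal(0.0, np.sqrt(omega2), size=n + 1)  # microstructure noise
    return x + u, x

# Null hypothesis sigma^2(t, x) = theta * x^2 with drift b(t, x) = 0.1 x.
n, theta = 1024, 1.0   # theta = 1 and x0 = 1 are illustrative choices
z, x = simulate_noisy_path(n, lambda t, y: 0.1 * y,
                           lambda t, y: theta * y * y, 0.1024 / n)
```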



Table 4. Simulated level of the bootstrap test based on the standardised 
Kolmogorov-Smirnov functional of $(\hat N_t)$ for various hypotheses. The variance 
of the noise process is defined by $n\omega^2 = 0.1024$ 

| $n/\alpha$ | $1$: 0.025 | $1$: 0.05 | $1$: 0.1 | $x^2$: 0.025 | $x^2$: 0.05 | $x^2$: 0.1 |
|---|---|---|---|---|---|---|
| 256 | 0.019 | 0.046 | 0.113 | 0.03 | 0.066 | 0.118 |
| 1024 | 0.02 | 0.049 | 0.099 | 0.034 | 0.07 | 0.119 |
| 4096 | 0.021 | 0.04 | 0.072 | 0.022 | 0.048 | 0.090 |

Table 5. Simulated level of the bootstrap test 
based on the standardised Kolmogorov-Smirnov 
functional of $(\hat M_t)$ for $\sigma(t,x) = \theta|x|$. The variance 
of the noise process is defined by $n\omega^2 = 0.1024$ 

| $n/\alpha$ | 0.025 | 0.05 | 0.1 |
|---|---|---|---|
| 256 | 0.040 | 0.076 | 0.136 |
| 1024 | 0.032 | 0.057 | 0.119 |



As an example for testing the hypothesis $\bar H_0$, we have chosen $\sigma(t,x) = \theta|x|$ and investigated 
the properties of the analogues of $Y_n$ and $Y_n^*$ from above, where we have replaced 
$n^{1/4}\hat N_t$ and $\hat s_t$ by $n^{1/4-\delta/2}\hat M_t$ and the corresponding standardising estimator, respectively. In this case, we chose $\delta = 1/4$, 
corresponding to $l_n = O(n^{-3/4})$ and a rate of convergence $n^{-1/8}$. Note that in this particular 
situation there is no need for stating the hypothesis in terms of $\bar H_0$ as it is equivalent 
to $\sigma^2(t,x) = \theta|x|^2$, but nevertheless it gives a reasonable impression of how well the 
bootstrap approximation works for testing hypotheses of the form $\bar H_0$. 

We observe from the results in Table 5 that even though the rate of convergence 
in Theorem 2 is worse than in Theorem 1, there is no substantial difference in the 
approximation of the nominal level by the bootstrap test for both types of hypotheses: 
The nominal level is slightly overestimated, but in general the parametric bootstrap 
yields a satisfactory and reliable approximation of the nominal level. 

Finally, Table 6 contains the rejection probabilities of the bootstrap test under the 
alternative. The null hypothesis is given by $H_0: \sigma^2(t,x) = \theta|x|^2$, and we discuss two local 
volatility alternatives, namely $\sigma^2(t,x) = 1$ and $\sigma^2(t,x) = 1 + |x|$, and one alternative 
coming from a stochastic volatility model. For this case, we chose the 
Heston model, that is, 

$$X_t = X_0 + \int_0^t (\mu - \nu_s/2)\,ds + \int_0^t \sigma_s\,dW_s$$

with 

$$\nu_t = \nu_0 + \kappa\int_0^t (\alpha - \nu_s)\,ds + \gamma\int_0^t \nu_s^{1/2}\,dB_s,$$

where $\nu_t = \sigma_t^2$ and $\mathrm{Corr}(W,B) = \rho$, and the parameters were chosen as $\mu = 0.05/252$, $\kappa = 
5/252$, $\alpha = 0.04/252$, $\gamma = 0.05/252$ and $\rho = -0.5$. 

Table 6. Simulated rejection probabilities of the bootstrap test based on the standardised 
Kolmogorov-Smirnov functional of $(\hat N_t)$ for various alternatives. The data is simulated with 
$\sigma^2(t,x) = \theta|x|^2$ and the variance of the noise process is defined by $n\omega^2 = 0.1024$ 

| alt: $n/\alpha$ | $1$: 0.025 | $1$: 0.05 | $1$: 0.1 | $1+\lvert x\rvert$: 0.025 | $1+\lvert x\rvert$: 0.05 | $1+\lvert x\rvert$: 0.1 | Heston: 0.025 | Heston: 0.05 | Heston: 0.1 |
|---|---|---|---|---|---|---|---|---|---|
| 256 | 0.057 | 0.128 | 0.237 | 0.073 | 0.152 | 0.263 | 0.722 | 0.870 | 0.941 |
| 1024 | 0.170 | 0.230 | 0.329 | 0.224 | 0.326 | 0.465 | 0.975 | 0.980 | 0.985 |
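
A path from the Heston alternative could be simulated as in the following sketch. It uses an Euler scheme in which the variance is truncated at zero inside the square root; this discretisation, the starting values and the parameter name `kappa` for the mean-reversion speed 5/252 are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def simulate_heston(n, mu, kappa, alpha, gamma, rho, x0=0.0, v0=None, seed=0):
    """Euler scheme for the Heston model on [0, 1] with n steps.

    X is the log-price, v the spot variance (v = sigma^2); the driving
    Brownian motions W and B have correlation rho."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n
    v0 = alpha if v0 is None else v0     # start at the long-run level (assumption)
    x = np.empty(n + 1)
    v = np.empty(n + 1)
    x[0], v[0] = x0, v0
    for i in range(n):
        dw = rng.normal(0.0, np.sqrt(dt))
        # build dB with correlation rho to dW
        db = rho * dw + np.sqrt(1.0 - rho ** 2) * rng.normal(0.0, np.sqrt(dt))
        vp = max(v[i], 0.0)              # truncate so the square root is defined
        x[i + 1] = x[i] + (mu - vp / 2.0) * dt + np.sqrt(vp) * dw
        v[i + 1] = v[i] + kappa * (alpha - vp) * dt + gamma * np.sqrt(vp) * db
    return x, v

# Parameter values as stated in the text.
x, v = simulate_heston(1024, mu=0.05 / 252, kappa=5 / 252,
                       alpha=0.04 / 252, gamma=0.05 / 252, rho=-0.5)
```

Noisy observations are then obtained by adding independent noise to `x`, exactly as for the local volatility models.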

We observe from the results depicted in Table 6 that the bootstrap test indicates in 
all cases that the null hypothesis is not satisfied. It is also remarkable that it is more 
difficult to detect the local volatility alternatives than the one coming from the Heston 
model. In the latter case, the rejection probabilities are extremely large even for a small 
sample size, contrary to the first two situations. 

Appendix: Proof of Theorem 1 

We will only prove Theorem 1, as similar methods show Theorem 2 as well. We start 
with a typical localisation argument, which allows us to assume that several quantities 
are bounded. Recall first that $a$ and $\sigma$ are locally bounded by assumption, from which 
it follows that $X$ is locally bounded as well. Thus we can conclude along the lines of 
Jacod [18] that we may assume without loss of generality that each of these processes 
is actually bounded. Since further each $\sigma_i^2$ is continuous and because $U$ has a compact 
support, we may conclude that both $(s,X_t)$ and $(s,\hat X_{k/n})$ (for arbitrary $s$, $t$, $k$ and $n$) 
are living on a compact set, and thus $\sigma_i^2(s,X_t)$ and $\sigma_i^2(s,\hat X_{k/n})$ are also bounded, the 
latter one uniformly in $n$. Similar results hold for the first two derivatives of $\sigma_i^2$ as well 
as for any of the functions $\bar\sigma_i$. Constants are denoted by $K$ throughout this section. 

The proof of Theorem 1 is based on several preliminary results, and we start with 
two results determining the rate of convergence of the quantities $\hat B_t^i - B_t^i$ and $\hat D_{ij} - D_{ij}$ 
defined in (2.5) and (2.4), respectively. The following result ensures that the (conditional) 
variance in a limit theorem for $\hat N_t - N_t$ will not depend on $\hat B_t^i$ and $\hat D_{ij}$, since the rate of 
convergence is $n^{-1/4}$. Thus, we will focus in the following on the behavior of $\hat C_i$ and $\hat B_t^0$. 

Theorem 3. Under the assumptions from Section 3 we have 

$$\hat B_t^i - B_t^i = o_p(n^{-1/4}) \quad \text{for } i = 1,\ldots,d,$$

$$\hat D_{ij} - D_{ij} = o_p(n^{-1/4}) \quad \text{for } i,j = 1,\ldots,d, \qquad \text{(A.1)}$$

where the first result holds uniformly with respect to $t \in [0,1]$. 

Proof. For a proof of the first estimate, we use for a fixed index $i$ the decomposition 

Regarding the first term in this sum, note that 

$$\hat X_{k/n} - X_{k/n} = \frac{1}{m_n}\sum_{j=1}^{m_n}\Big(U_{(k+j)/n} + \int_{k/n}^{(k+j)/n}\sigma_u\,dW_u\Big) + O_p(n^{-1/2})$$

and thus $\hat X_{k/n} - X_{k/n} = O_p(n^{-1/4})$. A Taylor expansion and boundedness of the second 
derivative of the function $\sigma_i^2$ give 

$$\frac1n\sum_{k=1}^{\lfloor nt\rfloor - m_n}\Big(\sigma_i^2\Big(\frac kn,\hat X_{k/n}\Big) - \sigma_i^2\Big(\frac kn,X_{k/n}\Big)\Big) = \frac1n\sum_{k=1}^{\lfloor nt\rfloor - m_n} A_{k,n} + O_p(n^{-1/2})$$

with 

$$A_{k,n} = \frac{\partial}{\partial y}\sigma_i^2\Big(\frac kn,X_{k/n}\Big)\,\frac{1}{m_n}\sum_{j=1}^{m_n}\Big(U_{(k+j)/n} + \int_{k/n}^{(k+j)/n}\sigma_u\,dW_u\Big).$$

However, we have $E[A_{k,n}A_{l,n}] = O(n^{-1/2})$ for arbitrary $k$ and $l$ as well as $E[A_{k,n}A_{k+l,n}] = 0$ 
for $l > m_n$ by conditioning on $\mathcal F_{(k+l)/n}$. This yields 



E 



£ E M =i E E s[A M ^ +l ,„]+o(^)=o(i 



^ k—l / J k—m n l— — m n 



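which is small enough. For the second term in the decomposition of $\hat B_t^i - B_t^i$ it holds that 

$$\frac1n\sum_{k=1}^{\lfloor nt\rfloor - m_n}\sigma_i^2\Big(\frac kn,X_{k/n}\Big) - \int_0^t\sigma_i^2(s,X_s)\,ds = \sum_{k=1}^{\lfloor nt\rfloor}\int_{(k-1)/n}^{k/n}\Big(\sigma_i^2\Big(\frac{k-1}{n},X_{(k-1)/n}\Big) - \sigma_i^2(s,X_{(k-1)/n}) + \sigma_i^2(s,X_{(k-1)/n}) - \sigma_i^2(s,X_s)\Big)\,ds + O_p(n^{-1/2}).$$

By differentiability in both components and from a similar expansion as above the claim 
follows. The result on $\hat D_{ij} - D_{ij}$ can be shown in the same way. □ 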

The following result specifies the convergence of the finite dimensional distributions 
of the processes, which are used for the construction of $\{\hat N_t\}_{t\in[0,1]}$. Below we use the 
notation $G_n \xrightarrow{\mathcal D_{st}} G$ to indicate stable convergence of a sequence of random variables 
$(G_n)$ to a limiting variable $G$, which is defined on an appropriate extension of the original 
probability space. For details on stable convergence see Jacod and Shiryaev [20]. 

Theorem 4. Define for any fixed $t_1,\ldots,t_k \in [0,1]$ the matrix $\Sigma_{t_1,\ldots,t_k}(s,X_s) = 
\gamma_s^2\,\xi(s,X_s)\xi^T(s,X_s)$, where $\xi(s,X_s) = (1_{[0,t_1]}(s),\ldots,1_{[0,t_k]}(s),h^T(s,X_s))^T$. Then we have 

$$n^{1/4}\big(\hat B^0_{t_1} - B^0_{t_1},\ldots,\hat B^0_{t_k} - B^0_{t_k},\hat C_1 - C_1,\ldots,\hat C_d - C_d\big)^T \xrightarrow{\mathcal D_{st}} \int_0^1\Sigma^{1/2}_{t_1,\ldots,t_k}(s,X_s)\,dW'_s,$$

where $W'$ is another Brownian motion, which is independent of the $\sigma$-algebra $\mathcal F$. 

Proof. Since $\hat\omega^2 - \omega^2 = O_p(n^{-1/2})$, one obtains 



1 n— m n , , v 1 n—m n , , , 



fc=l x ' k=l 



,Xi,/„ I — er 2 f —,Xuj„ ) \ai 



+ O p {n 

From similar arguments as given in the proof of Theorem 3 we find that the second term 
is of order $o_p(n^{-1/4})$ and thus asymptotically negligible as well. Therefore, we are left 

to focus on Fi n = ^Y^kZT™ °f (n'^fc/n^fc/n - Due *° ^ ne dependence structure of the 
summands in $F_i^n$ it will be convenient to use a "small-blocks-big-blocks" technique as in 
Jacod et al. [19] in order to prove Theorem 4. To this end, we choose an integer $p$, which 
eventually goes to infinity, and partition the $n$ observations into several subsets: We define 
$b_k(p) = k(p+1)m_n$ and $c_k(p) = k(p+1)m_n + pm_n$ and denote by $j_n(p)$ the largest integer 
$k$ such that $c_k(p) < n - m_n$ holds. Moreover, we use the notation $i_n(p) = (j_n(p)+1)(p+1)m_n$ 
and introduce for each $0 \le k \le j_n(p)$ and any $p$ the following random variables: 



e fc (p)-l 

G(k,P)l = -Vi — — ,X bk{p)/n \ °k/n> 



j=bk (p) 
&k+i(p)-l 



G(k,p)2 = ^af(^^-,X Ck(p)/n ) & l/r 



j=Ck (p) 



The remainder terms from $i_n(p)$ to $n - m_n$ are gathered in some $G(p)_3^n$. Note that each 
of these quantities depends on $i$, although it does not appear in the notation. 
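
The index sets behind this partition can be made concrete with a short sketch. It follows the definitions of $b_k(p)$, $c_k(p)$, $j_n(p)$ and $i_n(p)$ as stated in the text, reading the defining inequality for $j_n(p)$ as strict; the function name and the use of half-open index ranges are choices made for illustration.

```python
def block_boundaries(n, m_n, p):
    """Index ranges of the big and small blocks.

    Big block k covers j = b_k(p), ..., c_k(p) - 1 and has length p * m_n;
    small block k covers j = c_k(p), ..., b_{k+1}(p) - 1 and has length m_n.
    Ranges are returned as half-open pairs (start, stop)."""
    step = (p + 1) * m_n
    # j_n(p): largest k with c_k(p) = k (p + 1) m_n + p m_n < n - m_n
    j_n = -1
    while (j_n + 1) * step + p * m_n < n - m_n:
        j_n += 1
    big = [(k * step, k * step + p * m_n) for k in range(j_n + 1)]
    small = [(k * step + p * m_n, (k + 1) * step) for k in range(j_n + 1)]
    i_n = (j_n + 1) * step      # start of the remainder terms
    return big, small, i_n

big, small, i_n = block_boundaries(n=103, m_n=5, p=3)
print(big)    # big blocks of length p * m_n = 15
print(small)  # small blocks of length m_n = 5
print(i_n)
```

Consecutive big blocks are separated by a small block of length $m_n$, which is what makes the pre-averaged terms in different big blocks live on non-overlapping intervals.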

The main intuition behind these quantities is that the terms $G(k,p)_1^n$ are defined 
on non-overlapping intervals, which means that the intervals on which each $\bar Z_j^n$ within 
$G(k,p)_1^n$ lives are disjoint from those of any $\bar Z_{j'}^n$ within any other $G(l,p)_1^n$. This is sufficient 
to ensure some type of conditional independence, which will be used in order to prove 
Theorem 4. The variables $G(k,p)_2^n$ and $G(p)_3^n$ are filling the gaps between $G(k,p)_1^n$ and 
$G(l,p)_1^n$ and can be shown to be asymptotically negligible. 
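
To illustrate on which observations a single pre-averaged term lives, the following sketch computes pre-averaged increments in the style of Jacod et al. [19]. The weight function $g(x) = \min(x, 1-x)$, the discrete form of the weighted sum and the function names are assumptions made for illustration; the definitions used in the paper are given in its earlier sections.

```python
import numpy as np

def g(x):
    """Weight function g(x) = min(x, 1 - x) on [0, 1] (assumed choice)."""
    return np.minimum(x, 1.0 - x)

def preaveraged_increments(z, m_n):
    """Weighted local sums of increments of the observations z.

    zbar[j] = sum_{i=1}^{m_n - 1} g(i / m_n) * (z[j + i] - z[j + i - 1]),
    so zbar[j] only involves z[j], ..., z[j + m_n - 1]."""
    z = np.asarray(z, dtype=float)
    dz = np.diff(z)                          # increments of the observations
    w = g(np.arange(1, m_n) / m_n)           # g(1/m_n), ..., g((m_n - 1)/m_n)
    return np.correlate(dz, w, mode="valid")

# For a linear path all increments equal one, so each pre-averaged term
# equals the sum of the weights: 0.25 + 0.5 + 0.25 = 1 for m_n = 4.
zbar = preaveraged_increments(np.arange(10.0), 4)
print(zbar)
```

Since each term only uses a window of about $m_n$ consecutive observations, two terms whose indices are at least $m_n$ apart are built from disjoint sets of increments.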

An important tool will be the following decomposition of $|\bar Z_j^n|^2$. We set 

rj/n+s / j\ rj/n+s / -\ 

= g n [u--)a u du+ g n [u )a u dW u , 

Jj/n V 71 J Jj/n V n J 
and obtain by an application of Ito's formula 

iz;i 2 hx;i 2 + |f;i 2 +2x;i7; 

U+m n )/n / -\ 

V s9n[s- J -\a s &s + 2 



j/n 

(j+m n )/n 



{j+m n )/n 



j/n 



j/n 

2v: 



-t)<T 2 s d S +\u;\ 2 +2Tr; 



V B 3 9n\a-^)<T.dW a 

(j+m n )/n 



j/n 



9n ( s - - ) a s ds 



(J+m n )/n 
j/n 



g n [ s - -)<t s dW s 



X>(j)n 



1=1 



where the last identity defines the quantities $D(j)_l^n$ in an obvious manner. 

For bk(p) < J ' < Cfe(p) we introduce approximations for the quantities DfJ)^ and D(j)g , 
namely 



b k (p)/n 



(j+m n )/n / fj/n+s 



j/n 



j/n 



J 



J 



D{k,j,p)^ = 2a bkip)/n U j 



(J+m n )/n 



j/n 



fcU-- )dW u \g n [s- J - )dW s , 



g n [s - J - )dW t 



Additionally, we set $H(k,p)^n = \sigma_i^2\big(\tfrac{b_k(p)}{n},X_{b_k(p)/n}\big)\,Y(k,p)^n$, where 

c fc (p)-l 



Y(k,p) n = — n- 1 ' 2 V \D(k,j, P )% + D(k,j,p)%+ [D{ 3 yi-n- 1 / 2 ^ 2 

V J=bk(p) k V 

Finally, we define 

x(p)k= E \( SU P \a s -a t \ + \a s -a t \) \Fb k ( P )/ 

Ly s,t£[b k (p)/n,c k (p)/n] > 



(A.2) 



1/2 



The main part of the proof of Theorem 1 consists of two auxiliary results which specify the 
asymptotic properties of $F_i^n$. 



Lemma 1. We have 



Sjrz(p) 



j-n (p) 



hm limsupn 1 / 4 £ (G(fc,p)i + G(fc,p)?) + G(p)J - C, - ^H(k,p) =0. 



fe=0 



fc=0 
Proof. The proof goes through a rather large number of steps and makes extensive 
use of the decomposition in (A.2). We will show first that the influence of the random 
variables $D(j)_1^n$ and $D(j)_2^n$ within $G(k,p)_1^n$ (and analogously for $G(k,p)_2^n$ and $G(p)_3^n$) is 
asymptotically negligible, that is 



in (p) 



lim limsupn ^ 4 }^ a\ 



P ^0° n— too 



k=0 



bkip) 



Cfe(p)-1 

>*Mp)/») E (D(ffl+D(j)%)=0. (A.3) 

j=bk{p) 



For a proof of (A.3), assume without loss of generality that $b_k(p) \le j < c_k(p)$. One obtains 



J D(j)" = 2a &fc(p)A 



0'+Tt»)/n / pj/n+s 



9n \ U 



.)/<■ 



cr u dW u )g n (s 



ds 



{i+m n )/n / ri/n+s 



i/i 



9n[u a u dW u }g n [s (a s - a bk ^ p y n ) d 



O, 



and from the martingale property of a stochastic integral with respect to Brown- 
ian motion and the Cauchy-Schwarz inequality we derive that jj 7 ) 



b k (p)/n\ 



< 



Kn- 3 / 4 x(p)k- Thus, with the notation S(k,p)^ = E^fp) ^0')? we conclude 

|^[<S(fc,p)?|^(p)/„]| < Kpn-^x&l and £[(<5(fc,p)?) 2 | -F Mp)/ „] < ^n" 1 / 2 , 
and for k > I it follows 
'b k ( P ) 



El a 



> X b k (p)/n )0-i 



bi(p) 



X bl ( P )/ n )5(l,p)^E[S(k,p)^\J- bk(p)/T 



<Kp 2 n^E[x{j>)l\. 
Since $j_n(p)$ is of order $n^{1/2}/p$, we obtain 



E 



]n (p) 



- 1/4 E ^ 



fe=0 



hi?) 



bk(p)h 



Cfe(p)-1 



J E 



i=b k (p) 



in(p) 



KKlpn-^+^P^EMpTk] 



k>l 



From Lemma 5.4 in Jacod et al. [19] it follows that $\lim_{n\to\infty} n^{-1/2}\sum_{k=0}^{j_n(p)} E[x(p)_k^n] = 0$ 
for any $p$, which gives that the first term in the sum (A.3) converges to 0. The second 
term in (A.3) converges to zero from the independence of $X$ and $U$ and a standard 
martingale argument. 






The next step is devoted to the analysis of the term D^)^. We prove 



jn (p) 



lim limsupn 1//4 a\ 



p— >oo n _^ 00 

as well as 



\ n 



Cfc(p)-1 

.*Mp)/») E TO)S--D(*»J,P)2) = (A.4) 

3=bk(p) 



jn (p) 



lim lim sup n 4 / 4 



E< 

fe=0 



X, 



lim limsupn 1 ^ 4 cf 



c k (p)/n 



) -^-i n (p)/n 



b k+1 (p)-l 

E D 0')a=0, (A.5) 

n— 

E I»(j)5=0. (A.6) 

j = in(p) 



Set $b_k(p) \le j < c_k(p)$ again. A martingale argument as before allows us to focus on 

"(j+«i„)/n / rj/n+s 



g n [u - - I <*u dW u ) g n I s - - ) a s AW, 



J 



only. We have E[D"(j)$\F bk(p)/n ] = and E[\D"(j)^D"(l^\\T bk{p)/n ] < Kn~\ thus 
(A.6) follows easily. For (A.5), note that E[(J^^ ) ~ 1 D"(j)% f] < A, which gives (recall 
the definition of j n (p) , bk (p) and Ck (p) ) 



jrz (p) 



-1/2 



E* 



fc=0 



Ckip) 



X, 



Cfc(p)/n 



^&fc + l(p)-l 

E D "^)\ 

\ i=Ck(p) 



<Kn' 1 ' 2 - 



,1/2 



P 



converging to zero as p tends to infinity. We are thus left to prove 



lim limsupn 

p— >OG ^j—^oo 



(p) 

E< 

fc=0 



Cfe(p)-1 



E (^'(i)2-£(fe,j,p)£) = o. 



j=bk(p) 



This time, we have E[D"(j)™ - D{k,j,p)^\J : l 



b k (p)/n\ 



and 



n\2 



Thus, 



in (p) 



,-1/4 



E^ 



fc=0 



fcfc(p) 



>^Mp)A 



c fe (p)-l 

E TO)S--D(fc,J,p)5 

j=b k (p) 



jn (P) 

< AVn" 1 / 2 E E[(x(p) n k ) 2 

k=0 
and with a similar argument as in the proof of (A.3) we are done. Proving that D(j)2 
can be replaced by D(k,j,p)Q works analogously, thus we finish the proof of Lemma 1 
showing 

6fe+i(p)-l \ 



■of 



Chip) 



.*<*&,)/») E D U)2) ( A - 7 ) 



J=*» (p) ' J 



We start with the following proposition: 



lim limsupn 1/4 < V / a H > X b h M/n ^ds 



k=0 \Jb k (p)/n 

rbk+l(p)/n 

Ck(p)/n 



af(^M,X Ck(p)/n yids^ (A.8) 

^^(^x^^d.Va-Uo. 



As in the proof of Theorem 3, we have 



■^-o-?(s,X bk(p y n )( [ C7„dW u ]+Op 

°V \Jb k (p)/n J 



( pm n 
' V n 



thus 



rch(p)/n / fb k (p) \\ 
/ of (s, X s ) - of ,X bk(p)/n a s ds 

Jb k (p)/n V V 11 J J 



t-c k (p)/n 
, b k (p)/n 

5'(k,p)% + 8"(k,p)2 + O p 



2 2 

p X 



where 



^(Ms =<&,)/» / —af{s,X bkip)/n )[ dW u )ds 

Jb k (p)/n Oy \Jb k (p)/n J 



(A.9) 



(A.10) 
and S"(k,p)^ is defined implicitly by equation (A.10). We obtain 



lim lim sup n ' E 



f3-n.(p) \ 



fc=0 



= 



from the usual martingale argument and also 

o-n (p) j'« (p) 

lim limsupn 1/4 V M|<f''(A,j>)?|] < lim limsupifp 3/2 n- 1/2 V Mx(p)?]=0 

p->oo ^ p^oo f—< 

k—U k—0 



as before. The corresponding results for the other summands in (A.8) can be shown 
analogously. 

To finish the proof of Lemma 1, we have to show 



lim lim sup n 1 ^ 4 ^ ^ I °f 



p^oo n^oc 



k=0 



bk(p) 



b k (p)/n 



Ch(p)-1 
j=b k (p) 



c fc (p)/n 



erf ds 



bk(p)/n 



Ckip) 



X, 



Ck(p)/n 



W 2 Jr w W 



ds 



7 -^Q n (p) A 



/ . n-m 71 -i 

W 2 .Jfcp) W 



ds 



0, 



The last term is negligible, and the main idea for the tedious proof of the remaining 
terms is to fix k for a moment and to prove a representation of the form 



Cfe (p) 1 

J-n- 1 / 2 y Dm - / k 



bk(p) 



al ds 



(A.ll) 



for a suitable function $h_{n,p}(s)$, using the definition of the corresponding term $D(j)_l^n$ from (A.2). A similar expression can be 
found for the sum from $c_k(p)$ to $b_{k+1}(p)$ with some $\tilde h_{n,p}(s)$. A careful computation shows 
that $h_{n,p}(s)$ is either close to one (for $s$ in the center of the corresponding interval) or 
that $h_{n,p}(s)$ and $\tilde h_{n,p}(s)$ sum up to one (on its boundary). Then a Taylor expansion as 
in the proof of (A.8) gives the result. □ 
Lemma 2. We have 



lim limsupn 1 / 4 ^- ^ (G(fc,p)i + G(k,p)%) + G(p)J } = 0. 



p— >oo n _^ 00 



fe=0 



Proof. Without loss of generality, it suffices to show 

Jn(p) Cfc(p)-1 



lim hm sup n V 4 [af(s,X s ) — a^ 

rl ^°° fe= j=b h (j>) 

= 0. 



bk(p) 



i X b k {p)/n 1^1 -« 



-1/2^1 2 



The proof of this claim is tedious again. Essentially one simplifies the expression above 
by the Taylor expansion from (A.9) and a similar decomposition as in (A.2) for $|\bar Z_j^n|^2$ 
and discusses each term separately. □ 

Note that we have completely analogous results for a decomposition of $\hat B_t^0 - B_t^0$. Thus, 
we end up with 

lim limsupn 1/4 <^ (4° - B°) - £ Y(k,p)l {ck(p)/n < t} \ = 0, (A.12) 



p->°o n->oo 



In (p) 



fc=0 



lim limsupn 1 / 4 ] (C, - - £ a? ( ^M,X 6fe(p)/ „ }> = 



where Y(k,p) was defined in (A. 2). Since 

nE[{Y{k,p)f\T bk{p)/n } =^7 b 2 fc (p)/„ + o P (l) and £7[K(fc,p)| .F Mp)/n ] = 
as in Jacod et al. [19], we conclude 



3n (p) 



lim lim n 1/2 ^ £[y(fc,p) 2 l {c)c ( p)/ll < tiA t 3} |J"b fc ( p )/„] 



p— > oo n— »oc 



fe=0 



in (p) 



lim lim n 1 / 2 

»— >oon— >oo ^ — ' 



p-> 



fc=0 



y(fc,p) 2 l {cfc(p)/ „< ti} er 2 



,A b 



6fc(p)/n I \Fb k (p)/n 



ls 1 lo,t i )(s)(^(s,X s )ds, 
On (p) 

lim lim n 1/2 V E 

p— ¥00 n— >oo ' J 

= f 7 2 s aUs,X s )a*( S> X s )ds. 
Jo 

Theorem 4 follows now from Theorem IX 7.28 in Jacod and Shiryaev [20], since the 
missing conditions can be shown in the same way as in Jacod et al. [19]. □ 

The convergence of the finite dimensional distributions follows from the delta method 
for stably converging sequences, since we have 

$$n^{1/4}\big(\hat N_{t_1} - N_{t_1},\ldots,\hat N_{t_k} - N_{t_k}\big)^T \xrightarrow{\mathcal D_{st}} Y\int_0^1\Sigma^{1/2}_{t_1,\ldots,t_k}(s,X_s)\,dW'_s,$$

where the $k \times (d+k)$-dimensional matrix $Y$ has the form 

$$Y = (I_{k\times k},\,-Y^*), \qquad Y^* = \big(B_{t_1}^T D^{-1} \;\cdots\; B_{t_k}^T D^{-1}\big)^T.$$

A straightforward calculation shows that the conditional covariance coincides with the 
one of the finite dimensional distributions of the process defined in (4.2). We are left to 
prove the tightness of the process $n^{1/4}(\hat N_t - N_t)$, and this can be done by an application 
of Theorem VI.4.5 in Jacod and Shiryaev [20], using the boundedness of the processes 
involved as well as $E[|\det(D)|^{-\beta}] < \infty$. □ 